Sequential Hierarchical Pattern Clustering

Authors

  • Bassam Farran
  • Amirthalingam Ramanan
  • Mahesan Niranjan
Abstract

Clustering is a widely used unsupervised data analysis technique in machine learning. However, a common requirement amongst many existing clustering methods is that all pairwise distances between patterns must be computed in advance. This makes it computationally expensive and difficult to cope with large scale data used in several applications, such as in bioinformatics. In this paper we propose a novel sequential hierarchical clustering technique that initially builds a hierarchical tree from a small fraction of the entire data, while the remaining data is processed sequentially and the tree adapted constructively. Preliminary results using this approach show that the quality of the clusters obtained does not degrade while reducing the computational needs.
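The abstract gives only the outline of the approach: build a tree from a small fraction of the data, then adapt it as the remaining patterns arrive one at a time. The following is a minimal sketch of that outline, not the authors' algorithm. Everything specific in it is an assumption made for illustration: the names `Node`, `build_seed_tree` and `insert`, the use of Euclidean distance between centroids, and the rule that a new pattern descends towards the nearer child unless it is farther from both children than they are from each other.

```python
import math


class Node:
    """A tree node summarised by its centroid and the number of patterns below it."""

    def __init__(self, centroid, count, left=None, right=None):
        self.centroid = centroid
        self.count = count
        self.left = left
        self.right = right

    def is_leaf(self):
        return self.left is None


def merge(a, b):
    """Join two subtrees under a new parent with a count-weighted centroid."""
    n = a.count + b.count
    centroid = tuple(
        (a.count * x + b.count * y) / n for x, y in zip(a.centroid, b.centroid)
    )
    return Node(centroid, n, a, b)


def build_seed_tree(points):
    """Naive agglomerative clustering on the small seed sample (assumed linkage: centroid)."""
    nodes = [Node(tuple(p), 1) for p in points]
    while len(nodes) > 1:
        pairs = ((i, j) for i in range(len(nodes)) for j in range(i + 1, len(nodes)))
        i, j = min(
            pairs,
            key=lambda ij: math.dist(nodes[ij[0]].centroid, nodes[ij[1]].centroid),
        )
        b = nodes.pop(j)  # j > i, so remove j first to keep index i valid
        a = nodes.pop(i)
        nodes.append(merge(a, b))
    return nodes[0]


def insert(node, point):
    """Insert one pattern and return the adapted subtree (hypothetical rule)."""
    leaf = Node(tuple(point), 1)
    if node.is_leaf():
        return merge(node, leaf)
    d_left = math.dist(point, node.left.centroid)
    d_right = math.dist(point, node.right.centroid)
    gap = math.dist(node.left.centroid, node.right.centroid)
    if min(d_left, d_right) > gap:
        # The pattern fits neither child well: make it a sibling of this subtree.
        return merge(node, leaf)
    if d_left <= d_right:
        return merge(insert(node.left, point), node.right)
    return merge(node.left, insert(node.right, point))


def leaves(node):
    """Collect the patterns stored at the leaves of a subtree."""
    if node.is_leaf():
        return [node.centroid]
    return leaves(node.left) + leaves(node.right)


# Usage: seed the tree with a few patterns, then stream in the rest.
seed = [(0, 0), (0, 1), (10, 10), (10, 11)]
stream = [(0.5, 0.5), (10.5, 10.5), (1, 0), (11, 10)]
tree = build_seed_tree(seed)
for p in stream:
    tree = insert(tree, p)
```

In this sketch only the seed sample needs all pairwise distances; each later pattern is compared against centroids along a single root-to-leaf path. Whether the paper's method has the same cost profile is not stated in the abstract.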


Similar resources

Parallel Algorithms for Hierarchical Clustering and Cluster Validity

This correspondence proposes parallel algorithms on SIMD machines for hierarchical clustering and cluster validity computation. The machine model uses a parallel memory system and an alignment network to facilitate parallel access of both pattern matrix and proximity matrix. For a problem with N patterns, the number of memory accesses is reduced from O(N^3) on a sequential machine to O(N^2...


Does Fundraising Have Meaningful Sequential Patterns? The Case of Fintech Startups

Nowadays, fundraising is one of the most important issues for both Fintech investors and startups. The pattern of fundraising in terms of “number and type of rounds and stages needed” are important. The diverse features and factors that could stem from Fintech business models which can influence success are of the key issues in shaping these patterns. This study applied the top 100 KPMG Fintech...


Extracting straight lines by sequential fuzzy clustering

In clustering line segments into a straight line, threshold-based methods such as hierarchical clustering are often used. The line segments comprising a straight line often get misaligned due to noise. Threshold-based methods have difficulty clustering such line segments. A new cluster extraction method is proposed to cope with this problem. This method extracts fuzzy clusters one by one using mat...


Discriminating Subsequence Discovery for Sequence Clustering

In this paper, we explore the discriminating subsequence-based clustering problem. First, several effective optimization techniques are proposed to accelerate the sequence mining process and a new algorithm, CONTOUR, is developed to efficiently and directly mine a subset of discriminating frequent subsequences which can be used to cluster the input sequences. Second, an accurate hierarchical clu...


Sequential Pattern Mining of Social Networks

A social network is a graph made of social entities such as individuals, corporations, collective social units and organizations linked by interdependencies such as kinships, friendships, common interests, co-authorships, beliefs, prestige or financial exchanges. Social network analysis and mining search for implicit, previously unknown and potentially useful relationships in social networks. M...


Publication date: 2009